Search CORE

1 research outputs found

A theoretical and methodological framework for machine learning in survival analysis: Enabling transparent and accessible predictive modelling on right-censored time-to-event data

Author: Sonabend Raphael Edward Benjamin
Publication venue: UCL (University College London)
Publication date: 28/06/2021
Field of study

Survival analysis is an important field of Statistics concerned with mak- ing time-to-event predictions with ‘censored’ data. Machine learning, specifically supervised learning, is the field of Statistics concerned with using state-of-the-art algorithms in order to make predictions on unseen data. This thesis looks at unifying these two fields as current research into the two is still disjoint, with ‘classical survival’ on one side and su- pervised learning (primarily classification and regression) on the other. This PhD aims to improve the quality of machine learning research in survival analysis by focusing on transparency, accessibility, and predic- tive performance in model building and evaluation. This is achieved by examining historic and current proposals and implementations for models and measures (both classical and machine learning) in survival analysis and making novel contributions. In particular this includes: i) a survey of survival models including a crit- ical and technical survey of almost all supervised learning model classes currently utilised in survival, as well as novel adaptations; ii) a survey of evaluation measures for survival models, including key definitions, proofs and theorems for survival scoring rules that had previously been missing from the literature; iii) introduction and formalisation of composition and reduction in survival analysis, with a view on increasing transparency of modelling strategies and improving predictive performance; iv) imple- mentation of several R software packages, in particular mlr3proba for machine learning in survival analysis; and v) the first large-scale bench- mark experiment on right-censored time-to-event data with 24 survival models and 66 datasets. Survival analysis has many important applications in medical statistics, engineering and finance, and as such requires the same level of rigour as other machine learning fields such as regression and classification; this thesis aims to make this clear by describing a framework from prediction and evaluation to implementation

UCL Discovery